Abstract

We  describe  a  general  multi-stage  stochastic  integer-programming  model  for  planning  discrete 
capacity  expansion  of  production  facilities.  A  scenario  tree  represents  uncertainty  in  the  model. 
Variable  splitting  leads  to  two  forms  of  this  model:  the  first  allows  multiple  expansions  of  each 
facility over the planning horizon while the second allows at most one. Dantzig-Wolfe decomposition
of either split-variable model results in a binary master problem that solves easily, as
its  linear-programming  relaxation  tends  to  yield  integer  solutions.  For  each  scenario-tree  node, 
the  decomposition  defines  a  subproblem  that  may  be  viewed  as  a  single-period,  deterministic 
capacity-expansion  problem.  An  effective  solution  procedure  results  as  long  as  the  subproblems 
solve efficiently, and the procedure incorporates a good “duals stabilization scheme”. We present
computational results for a model to plan the capacity expansion of a real-world electricity distribution
network given uncertain future demand. The largest problem we solve to optimality
has  6  stages  and  243  scenarios  corresponding  to  a  deterministic  equivalent  with  a  quarter  of  a 
million  binary  variables. 
1 Introduction

Research from as early as the 1950s (Massé and Gibrat 1957) suggests that effective capacity planning
for industrial facilities must treat uncertainty explicitly. The list of uncertain parameters can
include demands on those facilities, expansion costs, operating costs, and production efficiencies.

This paper studies capacity-planning problems in which a sequence of discrete, capacity-expansion
decisions must be made over a finite planning horizon, subject to one or more sources of uncertainty.
A deterministic, single-period instance of our model, without capacity-expansion decisions,
can be viewed as an operations-planning model for a “system”, which might represent a single plant
with  multiple  production  facilities,  each  of  which  has  a  fixed  production  capacity  and  manufactures 
multiple  products.  Given  production  costs  and  known  product  demands,  the  system  manager  must 
identify  a  minimum-cost,  capacity-feasible,  operational  plan  to  meet  those  demands.  Even  this 
single-period,  deterministic  problem  can  be  complicated,  as  it  may  require  a  high  level  of  modeling 
fidelity  that  incorporates  both  continuous  and  discrete  decision  variables.  However,  the  full  planning 
problem spans a multi-period horizon, must incorporate capacity-expansion decisions to accommodate
demand growth, and faces uncertainty in demand, costs and possibly other parameters. An
optimal  capacity-expansion  plan  will  (1)  enable  production  to  meet  demand,  and  (2)  minimize  the 
expected  costs  of  capacity  expansion  plus  production  over  the  planning  horizon. 

The  stochastic  capacity-planning  problem  can  be  formulated  as  a  multi-stage,  stochastic, 
mixed-integer program that minimizes the expected discounted costs of capacity expansion and facility
operations. We represent uncertain parameters using a standard scenario tree (e.g., Ruszczyński
and  Shapiro  2003,  pp  29-30).  Given  a  finite  number  of  scenarios  and  their  probabilities,  this  problem 
can  then  be  recast  as  a  large-scale  mixed-integer  program,  i.e.,  a  “deterministic  equivalent”,  that 
can  be  solved,  in  theory,  by  a  commercial  optimization  code.  As  we  shall  see,  however,  only  the 
smallest  real-world  instances  may  be  tractable  with  this  approach. 

We  overcome  the  intractability  of  the  deterministic  equivalent  by  applying  dynamic  column 
generation to a Dantzig-Wolfe reformulation of the problem (Dantzig and Wolfe 1960, Appelgren
1969). The Dantzig-Wolfe master problem represents a simplified deterministic equivalent for the
problem,  and  subproblems  generate  columns  for  the  master  problem  at  each  node  of  the  scenario 
tree.  The  master  problem  exhibits  structure  that  tends  to  yield  integer  solutions  from  its  linear- 
programming  (LP)  relaxation,  making  it  particularly  easy  to  solve.  When  a  facility  can  be  expanded 
at  most  once  over  the  planning  horizon,  model  simplifications  enhance  performance.  Specially 
structured subproblems admit stronger formulations that further enhance performance, and “duals
stabilization”  for  the  master  problem  (e.g.,  du  Merle  et  al.  1999)  dramatically  improves  solution 
times  for  all  problem  variants. 

The  literature  on  stochastic  capacity-planning  problems  is  extensive:  Luss  (1982)  and  Van 
Mieghem (2003) present comprehensive surveys. Manne’s seminal paper (Manne 1961), which models
demand growth as an infinite-horizon stochastic process, stimulated much research on infinite-
horizon  models  (e.g.,  Giglio  1970,  Freidenfelds  1980).  However,  such  models  cannot  incorporate  the 
complex  operational  constraints  that  many  real-world  applications  require. 

More  recent  studies  incorporate  application-specific  constraints.  For  instance,  Sen  et  al. 
(1994) develop a two-stage model that integrates demand, capacity expansion, and budget constraints,
although it assumes only continuous capacity-expansion decisions and a single capacity-
expansion  technology.  The  authors  solve  the  model  using  a  sampling-based  stochastic-decomposition 
algorithm. 

The assumptions of a discrete probability distribution for uncertain parameters and a finite
planning horizon mean that a set of scenarios can represent uncertain outcomes, resulting in
a  (possibly  large-scale)  mathematical  programming  problem.  In  this  framework  it  is  possible  to 
include  a  detailed  operational  model  and  “strategic  details”  such  as  a  variety  of  capacity-expansion 
technologies.  Berman  et  al.  (1994)  present  and  solve  a  scenario-based  multi-stage  model  with  a 
single  capacity-expansion  technology.  Chen  et  al.  (2002)  extend  this  concept  to  multiple  capacity- 
expansion  technologies,  and  also  model  economies  of  scale.  However,  both  of  these  approaches 
assume  continuous  capacity-expansion  decisions. 

The  use  of  integer  variables  to  model  fixed-charge  cost  functions  and  economies  of  scale  adds 
considerable  complexity.  Eppen  et  al.  (1989),  Riis  and  Andersen  (2002),  Riis  and  Lodahl  (2002), 
and  Barahona  et  al.  (2005)  model  these  using  integer  variables  in  the  first-stage  of  two-stage  models. 

In  recent  years,  increased  computing  power  and  advances  in  optimization  techniques  have 
made  it  possible  to  develop  and  solve  multi-stage  stochastic  integer-programming  models.  Ahmed 
et  al.  (2003)  solve  such  problems  with  a  special  branch-and-bound  procedure  that  incorporates 
a heuristic upper-bounding method. Ahmed and Sahinidis (2003) and Huang and Ahmed (2005)
propose  approximation  schemes  that  asymptotically  converge  to  an  optimal  integer  solution  as  the 
planning  horizon  lengthens. 

Dynamic programming, though limited in its ability to integrate practical constraints, appears
in a few recent applications. Laguna (1998) solves a two-stage model, which Riis and Andersen
(2004) extend to multiple stages. Rajagopalan et al. (1998) present a multi-stage model with deterministic
demand, but with uncertainty in the timing of the availability of new capacity-expansion
technologies. 

A  multi-stage  stochastic  program  with  integer  variables  in  all  stages  does  not  allow  a  nested 
Benders  decomposition  as  does  its  continuous  counterpart.  In  theory,  LP-based  branch  and  bound 
can  solve  the  deterministic  equivalent  for  such  a  problem,  but  practical  instances  usually  exceed  the 
ability  of  today’s  software  and  hardware  to  solve  them.  Decomposition  procedures  based  on  column 
generation  are  becoming  more  common,  however,  for  solving  large  deterministic  integer  programs 
(e.g., Lübbecke and Desrosiers 2002). This has spawned new research in solving stochastic integer
programs:  Lulli  and  Sen  (2004)  use  branch  and  price  (column  generation  plus  branch  and  bound) 
for  stochastic  batch-sizing  problems;  Shiina  and  Birge  (2004)  use  column  generation  to  solve  a 
unit-commitment  problem  under  demand  uncertainty;  Damodaran  and  Wilhelm  (2004)  model  high- 
technology  product  upgrades  under  uncertain  demand;  and  Silva  and  Wood  (2004)  solve  a  generic 
class  of  two-stage  problems  by  branch  and  price. 

We propose a new column-generation approach for solving multi-stage, stochastic, capacity-planning
problems: our master problem and subproblems differ substantially from those developed
by  other  researchers.  Importantly,  the  generality  of  our  approach  should  lend  itself  to  applications 
in  many  industries. 

Our  research  relates  most  closely  to  that  of  Ahmed  et  al.  (2003).  These  authors  present  a 
multi-stage  stochastic  capacity-planning  model  that  includes  continuous  as  well  as  binary  capacity- 
expansion  decisions.  They  disaggregate  the  continuous  variables  using  the  reformulation  strategy 
of  Krarup  and  Bilde  (1977),  which  enables  a  strong  problem  formulation.  Our  approach  differs  in 
three major aspects:


1.  We  disaggregate  the  binary  capacity-expansion  decisions  rather  than  continuous  ones. 

2.  Random  demand  parameters  directly  determine  a  facility’s  capacity  requirements  in  Ahmed  et 
al.  (2003),  and  operational  constraints  are  simple:  total  installed  capacity  must  meet  or  exceed 
demand  (although,  in  theory,  their  model  can  accommodate  more  complicated  operational 
constraints).  Our  approach  incorporates  a  general  operational-level  submodel,  which  meets 
demand  using  installed  capacity  however  the  modeler  deems  fit. 

3.  Ahmed  et  al.  (2003)  solve  their  mixed-integer  program  using  an  LP-based  branch-and-bound 
algorithm  with  a  heuristic  upper-bounding  scheme;  we  use  column  generation. 

The remainder of this paper develops as follows. The next section describes a general, multi-stage,
stochastic, capacity-planning model with discrete capacity-expansion decisions. We formulate
this problem as a deterministic-equivalent mixed-integer program. A reformulation, using
the technique of “variable splitting”, then enables a Dantzig-Wolfe decomposition whose master
problem is likely to be stronger than that derived from the original formulation. Section 3 explores
the strength of the decomposition. Section 4 formulates a restricted form of the general model which
allows at most one expansion of each facility over the planning horizon. In Section 5 we present
computational results achieved by applying the general and restricted formulations to a capacity-planning
problem for an electricity-distribution network. The final section presents conclusions.

2  A  Multi-Stage,  Stochastic,  Capacity-Planning  Model 

We follow Ahmed et al. (2003) and represent uncertainty using a scenario tree $\mathcal{T}$ over $T$ decision
stages. For simplicity, we think of these stages occurring at evenly spaced increments of time. The
uncertain parameters represent a discrete-time stochastic process defined on a finite probability
space. The scenario tree at each stage $t$ consists of a set of nodes that represents collections of states
of the world that are indistinguishable up to time $t$. We denote by $\mathcal{N}$ the set of nodes of the
scenario tree, with $n \in \mathcal{N}$. Stage 1 comprises only $n = 1$, the root node of $\mathcal{T}$, which is where all
scenarios have the same realization. For each node $n \in \mathcal{N}$, $\phi_n$ denotes the probability that the state
of the world associated with node $n$ occurs. $\mathcal{T}_n$ denotes the successors of $n$, which we define to
include $n$ itself. Thus, $\mathcal{T}_n$ denotes $n$ plus all nodes “below” $n$ in the tree. $\mathcal{P}_n$ denotes the set of all
predecessors of $n$, which we define to include $n$ itself. Thus, $\mathcal{P}_n$ denotes $n$ plus all nodes “above” $n$
in the tree. For any leaf node $n$ in the tree, $\mathcal{P}_n$ defines a scenario. We now present the compact
formulation of our stochastic capacity-planning model.
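Before turning to the formulation, the tree notation above can be made concrete with a minimal Python sketch that builds a small scenario tree and computes the predecessor sets, the successor sets and the node probabilities. The seven-node tree, its branch probabilities and all function names are illustrative assumptions, not data from this paper.

```python
# Hypothetical 3-stage binary scenario tree (not from the paper).
# parent[n] is the direct predecessor p(n); the root (node 1) has none.
parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

# Assumed conditional probability of reaching each node from its parent.
branch_prob = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.5, 5: 0.5, 6: 0.3, 7: 0.7}


def predecessors(n):
    """P_n: node n plus all nodes above it, listed from n up to the root."""
    path = []
    while n is not None:
        path.append(n)
        n = parent[n]
    return path


def successors(n):
    """T_n: node n plus all nodes below it."""
    return sorted(m for m in parent if n in predecessors(m))


def phi(n):
    """Probability of node n: product of branch probabilities along P_n."""
    p = 1.0
    for h in predecessors(n):
        p *= branch_prob[h]
    return p


# Leaf nodes are nodes that are nobody's parent; each leaf defines a scenario.
leaves = [n for n in parent if n not in parent.values()]
scenarios = {n: predecessors(n) for n in leaves}
```

In this sketch `scenarios[4]` is `[4, 2, 1]`, the path from leaf 4 back to the root, which mirrors the statement that the predecessor set of a leaf defines a scenario.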

Data

$c_n$  discounted cost vector for expanding capacity of system facilities
at scenario-tree node $n$

$q_n$  discounted cost vector for operating the system at scenario-tree node $n$
$u_0$  vector of initial capacities of facilities

$V_n$  matrix that converts operating decisions and/or activities into capacity
utilization at scenario-tree node $n$

$U_{hn}$  non-negative matrix that converts capacity-expansion decisions at scenario-tree
node $h$ into available operating capacity at successor node $n \in \mathcal{T}_h$

Variables

Capacity-expansion decisions could be very complicated, because we might use various technologies
to expand a facility $f$, and decisions in one time period could affect decisions in another.
For simplicity, the model we describe here assumes that facility $f$ can be expanded at scenario-tree
node $n$ or not, and can be expanded multiple times over the planning horizon. This gives:

$x'_n$  vector of binary decisions for capacity expansion of facilities at scenario-tree
node $n$. Specifically, $x'_{fn} = 1$ if facility $f$ is expanded at node $n$, 0 otherwise.
$y_n$  vector of continuous and/or discrete operating decisions at scenario-tree node $n$
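Formulation

\begin{align}
\text{CF:}\quad \min\ & \sum_{n \in \mathcal{N}} \phi_n \left( c_n^\top x'_n + q_n^\top y_n \right) \tag{1} \\
\text{s.t.}\ & V_n y_n \le u_0 + \sum_{h \in \mathcal{P}_n} U_{hn} x'_h \quad \forall n \in \mathcal{N}, \tag{2} \\
& y_n \in \mathcal{Y}_n \quad \forall n \in \mathcal{N}, \tag{3} \\
& x'_n \in \{0,1\} \quad \forall n \in \mathcal{N}. \tag{4}
\end{align}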


Random demands, costs etc., appear as the parameters subscripted by $n$ in the model
(excluding $\phi_n$). Constraints (3) represent generic relationships between the operational variables
$y_n$, independent of all $x'_n$. Constraints (2) ensure that adequate capacity exists to satisfy the
operational requirements $V_n y_n$ at node $n$. The matrices $U_{hn}$ can model lags between when capacity-
expansion decisions are executed and when capacity becomes available, and, more generally, can
model capacity that increases or decreases over time after installation.

Constraints (2) and (3) can handle a general operational model at each node of the scenario
tree. If a set of discrete capacity-expansion decisions adequately models continuous capacity
expansions, the “(SCAP)” model of Ahmed et al. (2003) may be viewed as an instance of CF. In
particular, this instance sets $q_n = 0$ and defines constraints (3) as $y_n = d_n$, where $d_n$ represents
demands at node $n$.
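As an illustration of this special instance, the sketch below solves a tiny version of CF by brute-force enumeration. It assumes a single facility, $V_n = 1$, $q_n = 0$, $y_n = d_n$, and a constant capacity increment $U_{hn} = U$ with no installation lag; the three-node tree and every number in it are hypothetical.

```python
from itertools import product

# Hypothetical three-node tree: root 1 with children 2 and 3.
parent = {1: None, 2: 1, 3: 1}
phi = {1: 1.0, 2: 0.5, 3: 0.5}      # node probabilities
c = {1: 100.0, 2: 90.0, 3: 90.0}    # discounted expansion costs c_n
d = {1: 8.0, 2: 14.0, 3: 9.0}       # demands d_n (here y_n = d_n and q_n = 0)
u0, U = 10.0, 5.0                   # initial capacity; capacity added per expansion


def predecessors(n):
    """P_n: node n plus all nodes above it."""
    path = []
    while n is not None:
        path.append(n)
        n = parent[n]
    return path


def cf_feasible(x):
    """Constraint (2): demand at n is covered by u0 plus expansions on P_n."""
    return all(d[n] <= u0 + U * sum(x[h] for h in predecessors(n)) for n in parent)


def cf_cost(x):
    """Objective (1) with q_n = 0: expected discounted expansion cost."""
    return sum(phi[n] * c[n] * x[n] for n in parent)


nodes = sorted(parent)
candidates = [dict(zip(nodes, bits)) for bits in product((0, 1), repeat=len(nodes))]
feasible = [x for x in candidates if cf_feasible(x)]
best = min(feasible, key=cf_cost)
print(best, cf_cost(best))
```

With these assumed numbers, only node 2 needs extra capacity, and the cheapest plan defers the expansion to node 2 (expected cost 45) rather than expanding at the root (cost 100). Enumeration is only viable for toy instances; it is shown here to make constraint (2) tangible, not as a solution method.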

Capacity-planning problems like CF typically have weak LP relaxations, which makes
them difficult to solve. The scale imposed by a scenario tree, especially when some components
of $y_n$ must be integer, exacerbates this difficulty. On the other hand, an optimization model over
$y_n \in \mathcal{Y}_n$, for a single node $n$, might be relatively easy to solve as a mixed-integer program. This
structure suggests some form of decomposition.

2.1 A Split-variable Reformulation and Dantzig-Wolfe Decomposition

The classical approach to solving multi-stage stochastic linear programs uses nested Benders decomposition
(e.g., Birge and Louveaux 1997, pp 234-236). In general, however, integer variables in
subproblems make Benders decomposition inapplicable.


Our approach exploits Dantzig-Wolfe decomposition (Dantzig and Wolfe 1960) together
with dynamic column generation (e.g., Lübbecke and Desrosiers 2002). As we shall later discuss, a
straightforward Dantzig-Wolfe decomposition of CF could lead to a master problem that provides
a  weak  lower  bound  on  the  optimal  value  of  CF.  Consequently,  we  first  reformulate  CF  using  a 
variable-splitting  technique  and  then  apply  the  decomposition.  The  split-variable  formulation  is 


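\begin{align}
\text{SV:}\quad \min\ & \sum_{n \in \mathcal{N}} \phi_n \left( c_n^\top x'_n + q_n^\top y_n \right) \tag{5} \\
\text{s.t.}\ & x_{hn} \le x'_h \quad \forall n \in \mathcal{N},\ h \in \mathcal{P}_n, \tag{6} \\
& V_n y_n \le u_0 + \sum_{h \in \mathcal{P}_n} U_{hn} x_{hn} \quad \forall n \in \mathcal{N}, \tag{7} \\
& y_n \in \mathcal{Y}_n \quad \forall n \in \mathcal{N}, \tag{8} \\
& x'_n \in \{0,1\} \quad \forall n \in \mathcal{N}, \tag{9} \\
& x_{hn} \in \{0,1\} \quad \forall n \in \mathcal{N},\ h \in \mathcal{P}_n. \tag{10}
\end{align}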

The  proof  of  the  following  proposition  is  obvious. 


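Proposition 1 $(x'_n, y_n)_{n \in \mathcal{N}}$ is feasible for CF if and only if there exists $(x_{hn})_{h \in \mathcal{P}_n,\, n \in \mathcal{N}}$ such that
$(x'_n, (x_{hn})_{h \in \mathcal{P}_n}, y_n)_{n \in \mathcal{N}}$ is feasible for SV. That is, CF and SV are equivalent. ■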
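Although the proof is omitted, a short sketch of the argument may be useful. It relies only on the non-negativity of $U_{hn}$ stated in the data definitions, and it is offered as one way to see the equivalence rather than as the authors' own proof.

```latex
% Sketch of an argument for Proposition 1.
% (=>) Take (x'_n, y_n) feasible for CF and copy each grant into its requests:
\[
  x_{hn} := x'_h \qquad \forall n \in \mathcal{N},\ h \in \mathcal{P}_n .
\]
% Then (6) holds with equality and (7) is exactly (2).
%
% (<=) Take (x'_n, x_{hn}, y_n) feasible for SV.
% Since U_{hn} \ge 0 and x_{hn} \le x'_h by (6),
\[
  V_n y_n \;\le\; u_0 + \sum_{h \in \mathcal{P}_n} U_{hn} x_{hn}
          \;\le\; u_0 + \sum_{h \in \mathcal{P}_n} U_{hn} x'_h
  \qquad \forall n \in \mathcal{N},
\]
% so (2) holds. Objectives (1) and (5) are the same function of (x'_n, y_n),
% hence the two models also have the same optimal value.
```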

In SV, for each node $n$, and for each of its predecessor nodes $h \in \mathcal{P}_n$, we define new variables
$x_{hn}$ that indicate whether capacity expansions of facilities at scenario-tree node $h$ contribute towards
meeting the capacity requirement at node $n$. Here one may think of $x_{hn}$ as requests for capacity
expansions at nodes $h \in \mathcal{P}_n$ which, if granted, will jointly satisfy capacity requirements at node $n$.
Constraints (7) accumulate such requests. The variables $x'_n$ determine actual capacity expansions
at node $n$ and can be viewed as capacity grants. Thus the natural interpretation of constraints (6)
is variables $x_{hn}$ requesting capacity and variables $x'_h$ granting capacity. (As an alternative, looking
“down the tree” from node $n$, one may split $x'_n$ into variables $x_{nh}$, which indicate whether a capacity-
expansion decision at node $n$ is exploitable, non-exclusively, at successor node $h$: this equivalent
interpretation can be formalized by rewriting constraints (6) as $x_{nh} \le x'_n\ \forall n \in \mathcal{N},\ h \in \mathcal{T}_n$.)

The split-variable reformulation has some similarities to the reformulation that Krarup and
Bilde (1977) use to strengthen lot-sizing models, and to the variable-disaggregation based reformulation
used by Ahmed et al. (2003) for strengthening stochastic capacity-expansion models. Our
model differs from those in that the split variables $x_{hn}$ are binary and force binary capacity-expansion
decisions $x'_h$ that control the amount of capacity expansion. In contrast, Ahmed et al. (2003) disaggregate
continuous variables that force continuous and binary capacity-expansion decisions. (We do
not consider continuous capacity expansions.) Their model strengthens because demand explicitly
provides a lower bound on the capacity requirement of a facility, and this allows the computation
of tighter constraints.

Variable  splitting  is  a  common  technique  used  in  stochastic  programming  to  enable  the 
decomposition  of  certain  models.  The  conventional  application  of  this  approach  decomposes  a 
model  by  scenarios.  The  decomposed  model  can  then  be  solved  by  a  variety  of  approaches  such 
as Lagrangian relaxation (Schultz 2003), the branch-and-fix coordination scheme (Alonso-Ayuso et
al. 2003), or branch and price (Lulli and Sen 2004). Applied to CF, for each node $n \in \mathcal{N}$, this
approach would split variables $x'_n$ and $y_n$ into variables for the stage $t$ associated with $n$ and all
scenarios $s$ that are indistinguishable at $n$. Thus, the split variables here would be $x'_{ts}$ and $y_{ts}$.
Because  all  split  variables  for  a  particular  node  n  correspond  to  the  same  realization  of  the  random 
parameters,  their  values  must  be  equal:  “non-anticipativity  constraints”  impose  this  condition  (e.g., 
Birge  and  Louveaux  1997,  p  25).  The  scenario-decomposition  approach  relaxes  the  non-anticipativity 
constraints  to  decompose  the  problem  by  scenario.  The  number  of  non-anticipativity  constraints 
can  be  very  large  as  they  must  be  imposed  on  all  variables  at  each  non-leaf  node.  This  complicates 
scenario-decomposition  procedures. 

In  contrast  to  scenario  decomposition,  the  master  problem  resulting  from  our  decomposition 
of SV is simpler as it only involves non-anticipativity constraints on the variables $x'_n$, and not on $x_{hn}$
or $y_n$. Moreover, this structure allows us to decompose the problem by scenario-tree node, which
results  in  smaller,  more  manageable  subproblems  (as  described  below). 
2.2 Dantzig-Wolfe Reformulation of SV


The capacity-expansion constraints (6) in SV link capacity expansions across successors of a scenario-
tree node; these are “complicating constraints” to what are otherwise a set of simpler (sub)problems,
one for each scenario-tree node $n$. (Subproblem $n$ includes split variables $x_{hn}$ indexed over $h \in \mathcal{P}_n$,
but these variables are not linked across subproblems. Thus, they may be viewed as alternative
capacity-expansion choices for subproblem $n$ alone.) Hence, we can use decomposition to partition
the constraints of the split-variable formulation into two sets: the set of linking (complicating)
constraints (6), and the set of constraints specific to scenario-tree node $n$, for which we define

\begin{equation}
\mathcal{X}_n = \left\{ (x_{hn})_{h \in \mathcal{P}_n} \;\middle|\; V_n y_n \le u_0 + \sum_{h \in \mathcal{P}_n} U_{hn} x_{hn},\ x_{hn} \in \{0,1\}\ \forall h \in \mathcal{P}_n,\ y_n \in \mathcal{Y}_n \right\}. \tag{11}
\end{equation}

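In what follows, we find it convenient in some situations to replace the notation $(x_{hn})_{h \in \mathcal{P}_n}$ with the
more “vector-oriented” notation $(x_{nn} \cdots x_{1n}) = (x_{nn}\ x_{p(n)n}\ x_{p(p(n))n} \cdots x_{1n})$, where $p(n)$ denotes
the direct predecessor of node $n$.

Let $\mathcal{J}_n$ denote the index set for the finite set of vectors $\mathcal{X}_n$, whereby $\mathcal{X}_n = \{ (\hat{x}_{nn} \cdots \hat{x}_{1n})^j \mid j \in \mathcal{J}_n \}$.
We can then express any element of $\mathcal{X}_n$ through

\[
(x_{nn} \cdots x_{1n}) = \sum_{j \in \mathcal{J}_n} (\hat{x}_{nn} \cdots \hat{x}_{1n})^j w^j_n, \qquad \sum_{j \in \mathcal{J}_n} w^j_n = 1, \qquad w^j_n \in \{0,1\}\ \ \forall j \in \mathcal{J}_n.
\]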

Each vector $(\hat{x}_{nn} \cdots \hat{x}_{1n})^j$ represents a collection of capacity-expansion requests from nodes $h \in \mathcal{P}_n$;
satisfying  these  requests  will  ensure  feasible  system  operation  at  node  n.  Hence  we  refer  to  each 
collection  of  requests  as  a  feasible  expansion  plan  (FEP). 
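To illustrate what the set of FEPs looks like for one node, the following sketch enumerates it for a toy instance. It assumes a single facility, demand that must simply be covered by capacity, and a constant increment $U$ per expansion; the tree, the numbers and the function names are hypothetical.

```python
from itertools import product

# Hypothetical three-node tree: root 1 with children 2 and 3.
parent = {1: None, 2: 1, 3: 1}
d = {1: 8.0, 2: 14.0, 3: 9.0}   # assumed demands at each node
u0, U = 10.0, 5.0               # initial capacity; capacity added per expansion


def predecessors(n):
    """P_n: node n plus all nodes above it, listed from n up to the root."""
    path = []
    while n is not None:
        path.append(n)
        n = parent[n]
    return path


def feasible_expansion_plans(n):
    """All request vectors (x_hn), h in P_n, that cover demand at node n."""
    preds = predecessors(n)
    plans = []
    for bits in product((0, 1), repeat=len(preds)):
        if d[n] <= u0 + U * sum(bits):   # capacity requirement at node n
            plans.append(dict(zip(preds, bits)))
    return plans


# feps[n] plays the role of the set indexed by J_n in this toy setting.
feps = {n: feasible_expansion_plans(n) for n in parent}
```

Here node 2 has three FEPs (request at node 2 only, at node 1 only, or at both), because the empty request leaves capacity 10 below the demand of 14. The number of FEPs grows exponentially with the depth of the node, which is why the full index sets are never enumerated in practice.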

Without loss of generality, we may assume that each FEP has associated with it at least
one optimal operational plan $\hat{y}^j_n$, i.e., $\mathcal{J}_n$ simultaneously indexes FEPs and operational plans at
scenario-tree node $n$. Thus, we can attach the operational costs $q_n^\top \hat{y}^j_n$ to the $w^j_n$, and substitute the
expression above for $(x_{nn} \cdots x_{1n})$, to obtain the Dantzig-Wolfe reformulation of SV. (See Dantzig
and Wolfe 1960 as the seminal reference for models with continuous variables, and see Appelgren
1969 for the extension to integer variables.) We denote this multi-scenario, column-oriented master
problem as “MP”.

For each scenario node $n$, MP contains a group of columns with index set $\mathcal{J}_n$. Each $j \in \mathcal{J}_n$
corresponds to an FEP. For simplicity, we assume that MP is always feasible, i.e., $\mathcal{J}_n \neq \emptyset$ for any
$n$. The formulation for MP follows:

Sets and Indices

$j \in \mathcal{J}_n$  FEPs for scenario-tree node $n$

Data

$\hat{x}^j_{hn}$  binary vector representing capacity-expansion requests at node $h$ that form part
of FEP $j$ for node $n$

$\hat{y}^j_n$  operational plan used with FEP $j$

Variables

$x'_n$  binary decision vector for capacity expansion of facilities at scenario-tree node $n$
$w^j_n$  1 if FEP $j$ is selected for scenario-tree node $n$, 0 otherwise

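Formulation

\begin{align}
\text{MP:}\quad \min\ & \sum_{n \in \mathcal{N}} \phi_n c_n^\top x'_n + \sum_{n \in \mathcal{N}} \sum_{j \in \mathcal{J}_n} \phi_n q_n^\top \hat{y}^j_n w^j_n & & \text{[dual variables]} \tag{12} \\
\text{s.t.}\ & \sum_{j \in \mathcal{J}_n} \hat{x}^j_{hn} w^j_n \le x'_h \quad \forall n \in \mathcal{N},\ h \in \mathcal{P}_n, & & [\pi_{hn}] \tag{13} \\
& \sum_{j \in \mathcal{J}_n} w^j_n = 1 \quad \forall n \in \mathcal{N}, & & [\mu_n] \tag{14} \\
& w^j_n \in \{0,1\} \quad \forall n \in \mathcal{N},\ j \in \mathcal{J}_n, \notag \\
& x'_n \in \{0,1\} \quad \forall n \in \mathcal{N}. \notag
\end{align}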

Note  that  dual  variables  correspond  to  constraints  in  the  LP  relaxation  of  MP,  which  we  denote  as 
MP-LP.  Optimal  dual  variables  for  restricted  versions  of  MP-LP  (and  the  other  master  problem  in 
section  4)  will  be  extracted  for  purposes  of  column  generation. 

MP’s  objective  function  (12)  minimizes  expected  capacity-expansion  costs  plus  expected 
operational  costs.  Constraints  (13)  ensure  that  no  FEP  is  chosen  for  any  node  without  sufficient 
capacity  having  been  installed  (granted).  “Convexity  constraints”  (14)  select  exactly  one  FEP  for 
each  n. 
Naturally, the cardinality of $\mathcal{J}_n$ in MP will be huge, so we solve MP using dynamic column
generation. First, we create a restricted master problem (RMP) in which each set $\mathcal{J}_n$ represents a
modest-sized subset of all the FEPs at scenario-tree node $n$. We solve the LP relaxation of RMP
(RMP-LP), which replaces $w^j_n \in \{0,1\}$ and $x'_n \in \{0,1\}$ by $w^j_n \ge 0$ and $x'_n \ge 0$, respectively. (The
convexity constraints imply satisfaction of $w^j_n \le 1$ and $x'_n \le 1$.) Given a solution to RMP-LP, we
extract dual variables, and attempt to generate new columns corresponding to FEPs with negative
reduced costs, by solving optimization subproblems (e.g., Barnhart et al. 1998, Lübbecke and
Desrosiers 2002).

The  cycle  of  solving  RMP-LP,  extracting  duals,  and  generating  new  columns  repeats  until 
no columns price favorably, i.e., no columns with negative reduced cost can be found and so we have
solved  MP-LP  to  optimality.  If  the  optimal  solution  to  MP-LP  happens  to  be  integer,  then  we  have 
solved  MP.  If  not,  we  may  resort  to  a  branch-and-price  algorithm,  which  generates  columns  within 
a  branch-and-bound  procedure  (Savelsbergh  1997),  or  settle  for  solving  the  RMP  as  an  IP  in  the 
hope  of  getting  a  good  integer  solution. 

A column $j$ for node $n$ in MP has the form $[\phi_n q_n^\top \hat{y}^j_n,\ (\hat{x}_{nn} \cdots \hat{x}_{1n})^j,\ 1]^\top$, where $\phi_n q_n^\top \hat{y}^j_n$ is the
cost of the associated operational plan $\hat{y}^j_n$, and $(\hat{x}_{nn} \cdots \hat{x}_{1n})^j$ is the corresponding FEP. Given the
optimal duals, $\pi_{hn}$ and $\mu_n$ from RMP-LP, we can identify the column $j \in \mathcal{J}_n$ having the most
favorable reduced cost by solving the subproblem


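\begin{align}
\text{SP}(n):\quad \min\ & \phi_n q_n^\top y_n - \sum_{h \in \mathcal{P}_n} \pi_{hn}^\top x_{hn} - \mu_n \tag{15} \\
\text{s.t.}\ & V_n y_n \le u_0 + \sum_{h \in \mathcal{P}_n} U_{hn} x_{hn}, \tag{16} \\
& y_n \in \mathcal{Y}_n, \tag{17} \\
& x_{hn} \in \{0,1\} \quad \forall h \in \mathcal{P}_n. \tag{18}
\end{align}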


Any solution $((x_{nn} \cdots x_{1n}), y_n)$ of SP($n$) with a negative objective value lets us create a new column
for  RMP,  i.e.,  add  a  new  element  to  Jn.  If  no  such  solution  exists  for  any  n,  then  we  have  solved 
MP-LP  to  optimality. 
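The pricing step can be sketched for a toy instance by solving SP($n$) through enumeration. The sketch assumes a single facility, zero operating costs, demand that must simply be covered by capacity, and a constant increment $U$; the dual values are hypothetical numbers chosen for illustration and are not computed from an actual RMP-LP solve.

```python
from itertools import product

# Hypothetical three-node tree: root 1 with children 2 and 3.
parent = {1: None, 2: 1, 3: 1}
phi = {1: 1.0, 2: 0.5, 3: 0.5}   # node probabilities
q = {1: 0.0, 2: 0.0, 3: 0.0}     # assumed unit operating costs (zero here)
d = {1: 8.0, 2: 14.0, 3: 9.0}    # assumed demands
u0, U = 10.0, 5.0                # initial capacity; capacity added per expansion


def predecessors(n):
    """P_n: node n plus all nodes above it, listed from n up to the root."""
    path = []
    while n is not None:
        path.append(n)
        n = parent[n]
    return path


def price(n, pi, mu):
    """Solve SP(n) by enumeration; return (reduced_cost, plan) or None."""
    preds = predecessors(n)
    best = None
    for bits in product((0, 1), repeat=len(preds)):
        if d[n] > u0 + U * sum(bits):
            continue                      # violates the capacity constraint (16)
        plan = dict(zip(preds, bits))
        y = d[n]                          # the only operating plan in this toy
        rc = phi[n] * q[n] * y - sum(pi[(h, n)] * plan[h] for h in preds) - mu[n]
        if best is None or rc < best[0]:
            best = (rc, plan)
    return best


# Hypothetical duals for node 2: pi[(h, n)] <= 0 for rows (13), mu[n] for row (14).
pi = {(2, 2): 0.0, (1, 2): -100.0}
mu = {2: 100.0}
rc, plan = price(2, pi, mu)
if rc < 0:
    print("new column for node 2:", plan, "reduced cost", rc)
```

With these assumed duals, a request at the root is priced at 100 while a request at node 2 is free, so the subproblem returns the plan that expands at node 2 with reduced cost of -100, and that plan would be added to the restricted master as a new column.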
3 Strength of the Decomposition


Dantzig-Wolfe decomposition of a large LP replaces the direct solution of a large-scale problem
with  a  sequence  of  solutions  of  smaller,  easier-to-solve  problems.  This  indirect  approach  helps  when 
solving  large  MIPs  too.  Decomposition  of  a  MIP  may  also  improve  solution  efficiency  by  defining 
a  master  problem  whose  LP  relaxation  is  stronger  than  the  relaxation  of  the  original  MIP.  The  SV 
reformulation  of  CF  makes  this  possible  in  our  case. 

Recall that Dantzig-Wolfe decomposition expresses feasible points for the LP relaxation of
the  master  problem  as  convex  combinations  of  extreme  points  of  the  convex  hulls  of  the  set  of 
feasible  solutions  for  the  subproblems.  If  each  subproblem  is  simply  an  LP,  then  the  convex  hull 
of  the  set  of  feasible  solutions  is  identical  to  the  LP  feasible  region,  and  optimal  solutions  of  the 
LP  relaxation  of  the  master  problem  will  have  the  same  value  as  the  LP  relaxation  of  the  original 
problem.  On  the  other  hand,  if  the  convex  hull  of  a  subproblem’s  feasible  solutions  is  smaller  than 
its  LP  feasible  region — for  example  when  the  subproblem  is  an  IP  whose  LP  relaxation  does  not 
have  integer  extreme  points — then  the  resulting  master  problem  can  have  a  tighter  relaxation  than 
that  of  the  original  MIP  (Barnhart  et  al.  1998). 

In CF, we might consider applying a conventional Dantzig-Wolfe reformulation to the capacity-
expansion constraints (2). This results in subproblems for each scenario-tree node $n$ with operational
constraints (3) only over $\mathcal{Y}_n$. Indeed, in the not-uncommon case in which the $y_n$ are continuous
variables, the subproblems for a decomposition of CF are LPs, and no strengthening is obtained.

On the other hand, in the Dantzig-Wolfe decomposition of SV, the subproblems SP($n$) can
be viewed as single-period, discrete, capacity-expansion problems, which can be shown to be NP-
hard (by transformation to minimum-cover problems). Thus, they do not have LP relaxations with
integer extreme points, and so our decomposition gives a master problem whose LP relaxation is
stronger than that of the SV model. For example, in one of our test-problem instances the optimal
objective value for the LP relaxation of SV equals 123,388; in comparison, the corresponding MP-LP
has an optimal objective value of 960,881, about 7.79 times as large (a 679% increase).
It is also remarkable that MP-LP almost always has an optimal integer solution. Because
the constraint matrix for this problem has coefficients that are either 0 or 1, it is easy to see that
fixing the $w^j_n$ to binary values leads to binary solutions for $x'_n$ even when these variables are allowed
to be continuous. For each node $n$ in the scenario tree, the submatrix corresponding to the variables
$w^j_n$ has a perfect-matrix structure (Padberg 1974). These perfect submatrices prevent fractional
solutions from occurring within a single block of variables $w^j_n$, $j \in \mathcal{J}_n$, thus making it less likely for
fractional solutions to occur in MP-LP. (See Ryan and Falkner 1988 for an account of this effect
in set-partitioning problems.) On the other hand, the constraint matrix of MP-LP as a whole may
not be perfect since it has constraints on $x'_n$ that link its (perfect) submatrices. Consequently, the
interaction between these submatrices can give rise to fractional solutions, although we find that
these occur only rarely in practice. (Section 5 provides an example of a fractional optimal solution.)

4  At  Most  One  Capacity  Expansion  of  a  Facility 

The general model SV allows a facility’s capacity to be expanded more than once over the planning
horizon. However, in some industries, over reasonably long horizons, planning for multiple
expansions  makes  little  sense  because  associated  fixed  charges  are  large,  or  “setups”  have  highly 
undesirable  side  effects. 

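This section therefore studies a version of SV that restricts each facility to being expanded
at most once over the planning horizon. We also assume that $U_{hn}$ is deterministic and does not
evolve with the scenario tree or change over time, i.e., $U_{hn} = U$ $\forall\, n \in \mathcal{N},\ h \in \mathcal{P}_n$. With these
changes, SV becomes:

\begin{align}
\text{SV1}':\quad \min\quad & \sum_{n \in \mathcal{N}} \phi_n \left( c_n^\top x'_n + q_n^\top y_n \right) \tag{19}\\
\text{s.t.}\quad & x_{hn} \le x'_h && \forall\, n \in \mathcal{N},\ h \in \mathcal{P}_n, \tag{20}\\
& V_n y_n \le u_0 + U \sum_{h \in \mathcal{P}_n} x_{hn} && \forall\, n \in \mathcal{N}, \tag{21}\\
& \sum_{h \in \mathcal{P}_n} x'_h \le 1 && \forall\, n \in \mathcal{N}, \tag{22}\\
& y_n \in \mathcal{Y}_n && \forall\, n \in \mathcal{N}, \tag{23}\\
& x'_n \in \{0,1\} && \forall\, n \in \mathcal{N}, \tag{24}\\
& x_{hn} \in \{0,1\} && \forall\, n \in \mathcal{N},\ h \in \mathcal{P}_n. \tag{25}
\end{align}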

The reader will note that constraints (22) for non-leaf nodes are redundant. However, we
include all these constraints in RMP-LP because, for reasons we cannot explain, the Dantzig-Wolfe
algorithm tends to solve faster that way. (Of course, we turn off the optimizer’s presolver when
solving these LP relaxations so that it does not eliminate the redundant constraints.)
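
The redundancy can be seen from the tree structure alone: the predecessor set of a non-leaf node is contained in the predecessor set of any leaf below it, so a bound on the sum of the non-negative $x'_h$ along the leaf's path implies the same bound along the shorter path. The following is a minimal Python sketch of that containment check. The seven-node binary tree and the helper names are hypothetical and serve only as an illustration.

```python
# Hypothetical 3-stage binary scenario tree: node -> parent (root has parent None).
PARENT = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}


def predecessors(n, parent=PARENT):
    """Return P_n: the set containing n and all of its ancestors."""
    path = set()
    while n is not None:
        path.add(n)
        n = parent[n]
    return path


def leaves(parent=PARENT):
    """Nodes that are nobody's parent."""
    return set(parent) - set(parent.values())


def covered_by_leaf(n, parent=PARENT):
    """True if P_n is a subset of P_l for some leaf l, so that
    sum_{h in P_l} x'_h <= 1 and x' >= 0 imply sum_{h in P_n} x'_h <= 1."""
    p_n = predecessors(n, parent)
    return any(p_n <= predecessors(leaf, parent) for leaf in leaves(parent))


# Non-leaf nodes whose at-most-one-expansion constraint is implied by a leaf's.
redundant = sorted(n for n in PARENT if n not in leaves() and covered_by_leaf(n))
print(redundant)
```

In this toy tree every non-leaf node is covered, which matches the statement that only the leaf-node constraints are needed for correctness.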

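It is convenient to transform SV1' into an equivalent formulation with fewer variables:

\begin{align}
\text{SV1:}\quad \min\quad & \sum_{n \in \mathcal{N}} \phi_n \left( c_n^\top x'_n + q_n^\top y_n \right) \tag{26}\\
\text{s.t.}\quad & x_n \le \sum_{h \in \mathcal{P}_n} x'_h && \forall\, n \in \mathcal{N}, \tag{27}\\
& V_n y_n \le u_0 + U x_n && \forall\, n \in \mathcal{N}, \tag{28}\\
& \sum_{h \in \mathcal{P}_n} x'_h \le 1 && \forall\, n \in \mathcal{N}, \tag{29}\\
& y_n \in \mathcal{Y}_n && \forall\, n \in \mathcal{N}, \tag{30}\\
& x'_n \in \{0,1\} && \forall\, n \in \mathcal{N}, \tag{31}\\
& x_n \in \{0,1\} && \forall\, n \in \mathcal{N}. \tag{32}
\end{align}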

SV1' and SV1 are equivalent problems by virtue of the following proposition.

Proposition 2 There exists $(x_{hn})_{h \in \mathcal{P}_n}$ with $(x'_n, (x_{hn})_{h \in \mathcal{P}_n}, y_n)$ being feasible for SV1' if and only
if there exists $x_n$ such that $(x'_n, x_n, y_n)$ is feasible for SV1.

Proof. Suppose $(x'_n, (x_{hn})_{h \in \mathcal{P}_n}, y_n)$ is feasible for SV1'. Let $x_n = \sum_{h \in \mathcal{P}_n} x_{hn}$. To show that
$(x'_n, x_n, y_n)$ is feasible for SV1, it suffices to check that constraints (27), (28) and (32) are satisfied.
Constraints (20) imply (27), and constraints (21) give (28). Moreover, $x_n$ is binary because of (20)
and (22).

Conversely, if $(x'_n, x_n, y_n)$ is feasible for SV1, then let $x_{hn} = x'_h$ for all $h \in \mathcal{P}_n$. All constraints
of SV1' hold trivially, except for (21). These constraints are satisfied because

$$V_n y_n \le u_0 + U x_n \le u_0 + U \sum_{h \in \mathcal{P}_n} x'_h = u_0 + U \sum_{h \in \mathcal{P}_n} x_{hn}.$$

This  completes  the  proof.  ■ 
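
Both directions of the proof can be exercised on a toy instance. The sketch below assumes a single facility (so every quantity is a scalar), a hypothetical path $\mathcal{P}_n = \{1, 2, 3\}$ for node n = 3, and made-up values for $u_0$, $U$ and the load $V_n y_n$. It illustrates the argument and is not part of the paper's implementation.

```python
def sv1p_to_sv1(x_hn):
    """Forward direction: aggregate the requests, x_n = sum_h x_hn."""
    return sum(x_hn.values())


def sv1_to_sv1p(x_prime, path):
    """Converse direction: set x_hn = x'_h for every h in P_n."""
    return {h: x_prime[h] for h in path}


def feasible_sv1p(x_prime, x_hn, load, u0, U, path):
    """Check (20), (21), (22) and binarity for one node n (scalar case)."""
    return (
        all(x_hn[h] <= x_prime[h] for h in path)          # (20)
        and load <= u0 + U * sum(x_hn[h] for h in path)   # (21)
        and sum(x_prime[h] for h in path) <= 1            # (22)
        and all(v in (0, 1) for v in x_hn.values())       # (25)
    )


def feasible_sv1(x_prime, x_n, load, u0, U, path):
    """Check (27), (28), (29) and (32) for one node n (scalar case)."""
    return (
        x_n <= sum(x_prime[h] for h in path)              # (27)
        and load <= u0 + U * x_n                          # (28)
        and sum(x_prime[h] for h in path) <= 1            # (29)
        and x_n in (0, 1)                                 # (32)
    )


path = [1, 2, 3]                 # hypothetical P_n for n = 3
x_prime = {1: 0, 2: 1, 3: 0}     # facility expanded once, at node 2
u0, U, load = 10.0, 5.0, 12.0    # made-up data; load stands for V_n y_n

x_hn = {1: 0, 2: 1, 3: 0}        # node 3 requests the capacity installed at node 2
x_n = sv1p_to_sv1(x_hn)
forward_ok = feasible_sv1p(x_prime, x_hn, load, u0, U, path) and \
    feasible_sv1(x_prime, x_n, load, u0, U, path)

back = sv1_to_sv1p(x_prime, path)
converse_ok = feasible_sv1p(x_prime, back, load, u0, U, path)
print(forward_ok, converse_ok)
```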

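We can now formulate a Dantzig-Wolfe decomposition of SV1, analogous to that of section
2.2, by defining

$$\mathcal{X}_n = \left\{ x_n \;\middle|\; V_n y_n \le u_0 + U x_n,\ x_n \in \{0,1\},\ y_n \in \mathcal{Y}_n \right\},$$

and by expressing $x_n$ through $\hat{x}_n^j$, $j \in \mathcal{J}_n$, which denote the enumerated feasible solutions in $\mathcal{X}_n$:

$$x_n = \sum_{j \in \mathcal{J}_n} \hat{x}_n^j w_n^j, \qquad \sum_{j \in \mathcal{J}_n} w_n^j = 1, \qquad w_n^j \in \{0,1\}\ \ \forall\, j \in \mathcal{J}_n.$$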

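This gives a simplified master problem

\begin{align}
\text{MP1:}\quad \min\quad & \sum_{n \in \mathcal{N}} \phi_n c_n^\top x'_n + \sum_{n \in \mathcal{N}} \sum_{j \in \mathcal{J}_n} \left( \phi_n q_n^\top \hat{y}_n^j \right) w_n^j && \text{[dual variables]} \tag{33}\\
\text{s.t.}\quad & \sum_{j \in \mathcal{J}_n} \hat{x}_n^j w_n^j \le \sum_{h \in \mathcal{P}_n} x'_h \quad \forall\, n \in \mathcal{N}, && [\pi_n] \tag{34}\\
& \sum_{h \in \mathcal{P}_n} x'_h \le 1 \quad \forall\, n \in \mathcal{N}, \tag{35}\\
& \sum_{j \in \mathcal{J}_n} w_n^j = 1 \quad \forall\, n \in \mathcal{N}, && [\mu_n] \tag{36}\\
& w_n^j \in \{0,1\} \quad \forall\, n \in \mathcal{N},\ j \in \mathcal{J}_n, \notag\\
& x'_n \in \{0,1\} \quad \forall\, n \in \mathcal{N}, \notag
\end{align}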

and a simpler subproblem

\begin{align}
\text{SP1}(n):\quad \min\quad & \phi_n q_n^\top y_n - \pi_n^\top x_n - \mu_n \tag{37}\\
\text{s.t.}\quad & V_n y_n \le u_0 + U x_n, \tag{38}\\
& y_n \in \mathcal{Y}_n, \tag{39}\\
& x_n \in \{0,1\}. \tag{40}
\end{align}

Recall that SP(n) includes binary variables $x_{hn}$ for all nodes $h \in \mathcal{P}_n$. In contrast, SP1(n) incorporates
only binary variables $x_n$. Thus, the number of binary variables in SP1(n) reduces by a factor
of $|\mathcal{P}_n|$, which can make this subproblem easier to solve.

5  Computational  Results 

This  section  applies  the  SV  and  SV1  formulations  to  instances  of  a  model  for  planning  the  capacity 
expansion  of  an  electricity  distribution  network  subject  to  uncertain  demand.  The  details  of  this 
class  of  model  have  been  described  in  Singh  (2004),  so  we  give  only  a  brief  description.  A  distribution 
network  is  the  low-voltage  part  of  the  electricity  system  supplying  customers  from  a  single  source 
(typically  a  substation  connected  to  generating  plants  through  the  high  voltage  transmission  system). 
For  each  demand  realization,  the  distribution  network  of  interest  must  operate  in  a  radial  (tree) 
configuration,  which  means  that  power  flows  from  the  source  to  each  demand  point  along  a  unique 
path  of  power  lines.  Typically,  each  power  line  has  a  switch  at  either  end  that  can  be  open  or 
closed,  and  although  the  full  network  has  an  underlying  mesh  structure,  it  is  operated  in  a  radial 
configuration  by  opening  and  closing  these  switches. 

The configuration of the switches is determined by binary variables within constraints $y_n \in \mathcal{Y}_n$,
which must be satisfied at each scenario-tree node n. This makes each subproblem (SP(n) or
SP1(n)) a challenging mixed-integer program in its own right. A “super-network model” for any
subproblem  provides  a  stronger  LP  relaxation  for  that  subproblem.  This  model  replaces  certain 
sets  of  nodes  and  edges  with  simpler  constructs  involving  “super-nodes”  and  “super-edges”  which 
reduces  the  number  of  binary  variables,  and  exploits  some  problem-specific  valid  inequalities;  see 
Singh  et  al.  (2004)  for  details.  We  make  use  of  this  strengthened  formulation  in  all  of  the  tests 
reported  here. 

We  report  results  for  seven  problem  instances,  which  differ  by  the  number  of  stages  in 
a  binary  scenario  tree  (five  problems)  and  the  number  of  stages  in  a  ternary  scenario  tree  (two 
problems).  All  problem  instances  derive  from  data  for  an  actual  distribution  network  in  Auckland, 
New  Zealand.  The  network  data  comprise  152  nodes,  most  of  which  are  demand  points,  and  182 
edges. All problem instances have been designed so that an optimal solution always exists in which
no  edge  is  expanded  more  than  once  over  the  planning  horizon.  This  allows  us  to  apply  both  SV1 
and  SV  formulations  and  make  direct  comparisons. 

We  have  implemented  and  tested  our  algorithms  on  a  desktop  computer  with  a  Pentium  IV 
2.6  GHz  processor,  and  1  GB  of  RAM.  We  generate  all  models,  and  implement  our  decomposition 
algorithms  within  the  Mosel  algebraic  modeling  system,  version  1.24,  from  Dash  Optimization.  The 
LP  restricted-master  problems  are  solved  with  Xpress  Optimizer,  version  14.24,  also  from  Dash 
Optimization,  but  the  MIP  subproblems  and  the  deterministic-equivalent  models  are  solved  with 
CPLEX,  version  9.0  from  ILOG,  Inc. 

Solver  settings  remain  constant  throughout  all  tests.  All  MIPs  are  solved  with  default 
parameter  settings  except  that  Gomory  cuts  are  turned  off  and  a  moderate  level  of  probing  is  used 
(CPX_PARAM_PROBE = 2). All subproblems are solved to optimality and the deterministic-equivalent
problems are solved with a relative optimality tolerance of 1.0%. The time to solve each
problem  instance  is  limited  to  7,200  seconds. 

Observe  that  any  (nontrivial)  instances  of  RMP-LP  will  be  infeasible  unless  one  feasible 
column  (FEP)  exists  for  each  scenario-tree  node.  We  could  use  the  classical  “Phase  I”  approach  to 
finding  an  initial  feasible  solution  (e.g.,  Dantzig  and  Thapa  2003,  pp  291-292),  but  it  is  simpler  to 
guarantee  such  a  solution  by  seeding  the  master  problem  with  one  FEP  for  each  scenario-tree  node. 
Except  for  trivially  infeasible  problems,  an  FEP  for  each  node  that  requires  all  possible  capacity 
expansions  will  surely  be  feasible,  so  those  generate  our  initial  columns. 

Any  such  FEP  translates  into  a  column  in  RMP-LP  that  has  coefficients  of  1  in  the  capacity- 
expansion constraints for each facility, a coefficient of 1 in the convexity constraint for the
corresponding scenario-tree node, and 0s elsewhere. Note that our application imposes no operational
costs,  so  these  initial  columns,  as  well  as  the  columns  generated  later,  all  have  cost  coefficients  of  0. 
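
The seeding step can be sketched as follows. This is a minimal illustration that assumes the SV1 master problem, a hypothetical three-node scenario tree and two hypothetical facilities `e1` and `e2`. The row labels and the dictionary layout are illustrative choices, not the data structures of the Mosel implementation.

```python
def seed_columns(nodes, facilities):
    """Build one all-expansions column per scenario-tree node.

    Each column is a sparse dict {row_label: coefficient}; rows that are
    absent have coefficient 0. The cost is 0 because the application
    described in the text has no operational costs.
    """
    columns = {}
    for n in nodes:
        # Coefficient 1 in the capacity-expansion row of every facility.
        coeffs = {("expand", e, n): 1 for e in facilities}
        # Coefficient 1 in the convexity row of node n.
        coeffs[("convexity", n)] = 1
        columns[n] = {"cost": 0.0, "coeffs": coeffs}
    return columns


cols = seed_columns(nodes=[1, 2, 3], facilities=["e1", "e2"])
print(len(cols), sorted(cols[2]["coeffs"], key=str))
```

Keeping the columns sparse mirrors the description above: only the expansion rows and one convexity row carry nonzero entries.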

At each iteration of the Dantzig-Wolfe decomposition, a lower bound $\underline{z}_{LP}$ on $z^*_{LP}$, the optimal
objective value for MP-LP, is readily available. In particular, using the arguments in Wolsey (1998,
p 189), it is easy to show that

$$\underline{z}_{LP} = z_{LP} + \sum_{n \in \mathcal{N}} S_n \le z^*_{LP}, \tag{41}$$

where $z_{LP}$ and $S_n$ denote the optimal objective values for RMP-LP and SP(n) at the current
iteration, respectively. Note that this lower bound is only valid when “full pricing” is invoked, i.e.,
after all subproblems SP(n), $n \in \mathcal{N}$, have been solved to optimality. At any particular iteration, it is
easy to compute an upper bound $\bar{z}_{IP}$ on the optimal integer objective of MP by solving the integer
RMP (RMP-IP) with the existing set of columns (assuming this is feasible). We define the (relative)
optimality gap for the master problem, “MP-Gap”, as $100\% \times (\bar{z}_{IP} - \underline{z}_{LP}) / \underline{z}_{LP}$. MP-Gap gives an
optimality check on our algorithm which can be used to terminate the Dantzig-Wolfe decomposition
early if it has decreased to a tolerable level.
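
The bound and the gap are simple to compute once the restricted master problem and all subproblems have been solved. The following Python sketch shows the arithmetic. The function names and the numerical values are made up for illustration, and the bound is meaningful only under the full-pricing condition stated above.

```python
def lower_bound(z_rmp_lp, subproblem_values):
    """Bound (41): RMP-LP objective plus the sum of the optimal subproblem
    objective values. Valid only after every subproblem has been solved
    to optimality in the current iteration (full pricing)."""
    return z_rmp_lp + sum(subproblem_values)


def mp_gap(z_ip_upper, z_lp_lower):
    """MP-Gap in percent: 100 * (upper - lower) / lower."""
    if z_lp_lower <= 0:
        raise ValueError("relative gap is undefined for a non-positive lower bound")
    return 100.0 * (z_ip_upper - z_lp_lower) / z_lp_lower


# Made-up values for one iteration of a three-node problem.
z_rmp_lp = 1000.0
subproblem_values = [-30.0, -20.0, 0.0]
z_ip_upper = 1000.0

lb = lower_bound(z_rmp_lp, subproblem_values)
gap = mp_gap(z_ip_upper, lb)
print(lb, round(gap, 2))
```

With these numbers the gap is a little over 5%, so a 5% termination tolerance would not yet be met.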

Observe that when the solution to RMP-LP is fractional, we must solve RMP-IP to obtain
the upper bound $\bar{z}_{IP}$, which can be expensive if carried out at every iteration. Thus for the overall efficiency of the
algorithm,  the  number  of  such  checks  should  be  minimized.  As  an  empirical  rule,  we  start  checking 
the  MP-Gap  at  the  first  iteration  when  the  gap  between  the  RMP-LP  objective  and  the  lower 
bound,  “LP-Gap” ,  reaches  80%  of  a  (preset)  termination  tolerance.  For  instance,  for  a  termination 
tolerance  of  5%,  we  start  checking  MP-Gap  when  LP-Gap  reaches  4%.  After  the  first  check,  we 
re-solve  RMP-IP  with  a  branch-and-bound  algorithm  only  when  RMP-LP  yields  fractional  solutions 
for  five  consecutive  iterations.  We  demonstrate  the  effect  of  termination  tolerances  on  solution  times 
later. 
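
One possible reading of this empirical rule is sketched below in Python. The class name, its parameters, and the choice to skip the integer solve whenever RMP-LP is already integral are assumptions made for illustration. The source states only the 80% threshold and the five-consecutive-fractional-iterations rule.

```python
class GapCheckSchedule:
    """Decide at each iteration whether to solve RMP-IP by branch and bound."""

    def __init__(self, tolerance_pct, start_fraction=0.8, fractional_run=5):
        # For a 5% tolerance the first check is triggered at an LP-Gap of 4%.
        self.threshold = start_fraction * tolerance_pct
        self.fractional_run = fractional_run
        self.first_check_done = False
        self.consecutive_fractional = 0

    def should_solve_ip(self, lp_gap_pct, rmp_lp_is_fractional):
        if not rmp_lp_is_fractional:
            # Assumption: an integer RMP-LP solution already yields the upper bound.
            self.consecutive_fractional = 0
            return False
        self.consecutive_fractional += 1
        if not self.first_check_done:
            if lp_gap_pct <= self.threshold:
                self.first_check_done = True
                self.consecutive_fractional = 0
                return True
            return False
        if self.consecutive_fractional >= self.fractional_run:
            self.consecutive_fractional = 0
            return True
        return False


schedule = GapCheckSchedule(tolerance_pct=5.0)
history = [(10.0, True), (3.9, True)] + [(3.5, True)] * 5  # made-up LP-Gap trace
decisions = [schedule.should_solve_ip(gap, frac) for gap, frac in history]
print(decisions)
```

In this trace the integer master problem is solved twice: once when LP-Gap first drops below 4%, and again after five further fractional iterations.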

Unfortunately, our Dantzig-Wolfe master problems suffer from severe dual degeneracy. Consequently,
convergence using a conventional Dantzig-Wolfe algorithm is slow, ranging from hours to
days. To improve convergence, we apply “duals stabilization” in the RMP-LP, and compare two
different  methods:  du  Merle  et  al.  (1999)  describe  the  first,  which  we  call  “du  Merle  stabilization”; 
the  other  simply  generates  interior-point  dual  solutions  by  solving  RMP-LP  using  an  interior-point 
algorithm.  For  lack  of  a  better  phrase,  we  call  this  technique  “interior-point  duals  stabilization”. 

The  optimal  solutions  of  MP-LP  are  invariably  integer  in  our  test  problems.  Consequently, 
we have not required a full branch-and-price solution procedure. It is interesting to note, however,
that fractional optimal solutions are possible, at least in the master problem of the SV formulation;
see figure 1 for an example network and figure 2 for the corresponding MP-LP.

Figure 1: Data for an example in which the master problem of the SV formulation has a fractional LP
solution. The diagram on the left represents a distribution network with three edges that connect
supply node 3 to demand nodes 1 and 2. The tables on the right contain data for a single-scenario,
2-stage problem instance, i.e., $\mathcal{N} = \{1, 2\}$; here, $U_{ehn} = U_e$ $\forall\, n \in \mathcal{N},\ h \in \mathcal{P}_n$.

Figure 2: Constraint matrix and LP solution to the master problem of SV for the 2-stage
single-scenario problem specified in figure 1. The solution is fractional.


The  fractions  arise  from  the  interaction  of  requests  by  scenario-tree  nodes  1  and  2  for  capacity 
expansions  on  edges  1  and  2  at  scenario-tree  node  1.  Interestingly,  however,  an  alternate,  integer, 
optimal  solution  exists:  x'u  =  0,  x'2X  =  1,  x31  =  1,  x\2  =  1,  x'22  =  1,  x32  =  1,  raj  =  0,  w\  =  1,  wf  = 
0,  w\  =  0  and  w\  =  1. 

Scenario-tree Statistics          Deterministic Equivalent  Dantzig-Wolfe Decomposition
Stages    Scenarios  Tree nodes   SV-DE       SV1-DE        SV-DW-M     SV-DW-I     SV1-DW-M    SV1-DW-I
(number)  (number)   (number)     (CPU sec.)  (CPU sec.)    (CPU sec.)  (CPU sec.)  (CPU sec.)  (CPU sec.)
2         2          3            7.5         2.5           20.4        55.9        4.3         17.7
3         4          7            (1.8%)      1457.6        -           203.5       95.0        55.8
4         8          15           (42.2%)     (34.9%)       -           2852.3      638.1       284.5
5         16         31           (69.3%)     (65.6%)       -           (85.1%)     3624.2      1212.8
6         32         63           *           *             -           -           -           4301.4
5         81         121          *           *             -           (26.9%)     7043.5      2812.3
6         243        364          *           *             -           -           -           (7.6%)

Table  1:  Solution  times  for  each  procedure.  The  values  in  parentheses  are  the  relative  optimality 
gaps  achieved  at  7,200  seconds.  An  asterisk  denotes  that  no  integer  feasible  solution  was  found  in 
7,200  seconds  and  a  dash  indicates  that  the  optimality  gap  was  more  than  100%. 


We  use  the  following  abbreviations  to  denote  the  various  formulations  discussed  in  earlier 
sections. 


Abbreviation   Formulation and Solution Procedure
SV-DE          general split-variable formulation SV, solved as a deterministic equivalent
SV1-DE         specialized split-variable formulation SV1 that allows the expansion of
               a facility at most once in a scenario, solved as a deterministic equivalent
SV-DW-M        Dantzig-Wolfe decomposition of SV with du Merle duals stabilization
SV-DW-I        Dantzig-Wolfe decomposition of SV with interior-point duals stabilization
SV1-DW-M       Dantzig-Wolfe decomposition of SV1 with du Merle duals stabilization
SV1-DW-I       Dantzig-Wolfe decomposition of SV1 with interior-point duals stabilization


Table  1  displays  the  scenario-tree  statistics  for  the  seven  problem  instances,  along  with  their 
solution times as deterministic equivalents, or using Dantzig-Wolfe decomposition. These results
illustrate  the  power  of  decomposition  in  solving  the  larger  problem  instances. 
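
The scenario and node counts in Table 1 follow directly from the shape of the tree. As a quick check, the Python sketch below assumes balanced trees in which every non-leaf node has the same number of children (two for the binary instances, three for the ternary ones). The function name is illustrative.

```python
def tree_size(stages, branching):
    """Scenarios (leaves) and total nodes of a balanced scenario tree in
    which every non-leaf node has `branching` children."""
    scenarios = branching ** (stages - 1)
    nodes = sum(branching ** t for t in range(stages))
    return scenarios, nodes


# (stages, branching factor) for the seven instances: five binary, two ternary.
instances = [(2, 2), (3, 2), (4, 2), (5, 2), (6, 2), (5, 3), (6, 3)]
sizes = {inst: tree_size(*inst) for inst in instances}
for inst, size in sizes.items():
    print(inst, size)
```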

The  test  problems  are  quite  large.  The  largest  SV  instance  we  can  solve  with  decomposition 
has  5  stages  and  81  scenarios.  It  results  in  an  SV-DE  model  having  158,602  binary  variables  and 
194,864 constraints. The corresponding SV1-DE model has 81,070 binary variables and 117,332
constraints. Both models have 15,004 continuous variables. Neither CPLEX 9.0 nor Xpress Optimizer
14.24  can  solve  either  of  these  problems  in  one  day  of  computing  time. 

For  this  same  instance,  the  largest  subproblems  for  SV-DW  have  only  1,216  binary  variables, 
while  the  SV1-DW  subproblems  have  just  488  binary  variables.  The  subproblems  share  the  same 
124  continuous  variables  and  800  constraints,  and  each  solves  in  under  3  seconds  on  average.  (Recall 
that  the  number  of  binary  variables  in  the  SV-DW  subproblem  for  node  n  increases  with  its  depth 
in  the  scenario  tree.  Thus,  the  subproblems  for  leaf  nodes  are  the  largest.) 

The  master  problems  for  SV-DW  and  SV1-DW  are  of  modest  size,  too.  The  restricted 
SV1-DW-I master problem for the 5-stage-81-scenario problem has only 23,161 variables in its last
iteration,  iteration  18  (see  Table  3),  and  requires  only  8.5  seconds  to  solve.  In  all  iterations  it 
has  44,165  constraints.  The  SV-DW  master  problem  always  has  more  constraints  (see  section  2.2), 
but  its  linear-programming  relaxation  usually  solves  quickly,  too.  The  SV-DW  master  problem  has 
99,675  constraints  for  the  5-stage-81-scenario  problem  instance.  Although  SV-DW-I  cannot  solve 
this problem in under 7,200 seconds, at iteration 18 its LP master problem has 24,181 variables and
solves in 7.3 seconds; by iteration 92 the number of variables has grown to 27,808, yet the problem
still requires only 9.9 seconds to solve.

Our results show that interior-point duals stabilization is an important adjunct to the
decomposition methodology, and that it is clearly superior to du Merle stabilization. For the
2-stage-2-scenario problem instance, the du Merle stabilization requires extensive tuning of its parameters
to get SV-DW-M to converge. We also spent considerable effort tuning parameters for the
3-stage-4-scenario problem instance, but without success (as indicated by the dash). In contrast, the
interior-point duals stabilization requires no tuning (other than ensuring that the standard “crossover”
to a basic feasible solution is disabled), and it significantly outperforms the du Merle alternative.
Nonetheless, the results of both duals-stabilization schemes exhibit the well-known tailing-off effect.
Thus, terminating the Dantzig-Wolfe decomposition early by setting an acceptable optimality
tolerance for MP-Gap may still give good solutions, without incurring the excessive computation time
that it can take to reach optimality. Table 2 reports the time it takes SV-DW-I and SV1-DW-I to
satisfy tolerances of 5%, 1% and 0%.


Scenario-tree Statistics          SV-DW-I (CPU sec.)            SV1-DW-I (CPU sec.)
Stages    Scenarios  Tree nodes   5%        1%        0%        5%        1%        0%
(number)  (number)   (number)
2         2          3            30.9      35.6      55.9      14.1      14.1      17.7
3         4          7            151.4     174.0     203.5     33.9      52.0      55.8
4         8          15           1886.3    2088.7    2852.3    188.1     188.1     284.5
5         16         31           16908.2   21355.2   24620.0   838.8     935.5     1212.8
6         32         63           -         -         -         2303.4    3286.7    4301.4
5         81         121          18005.3   21875.7   29820.7   1171.2    1370.4    2812.3
6         243        364          -         -         -         7407.1    11146.2   23637.9

Table  2:  Computation  times  for  SV-DW-I  and  SV1-DW-I  to  reach  relative  optimality  gaps  of  5%, 
1%  and  0%. 

Table  3  reports  the  corresponding  number  of  restricted  master-problem  iterations.  As  shown 
in  this  table,  the  SV-DW  decomposition  requires  many  more  iterations  to  converge  than  SV1-DW.  As 
observed  above,  the  differences  in  the  average  solution  times  between  the  restricted  master  problems 
and  subproblems  for  SV  and  SV1  are  relatively  small.  So  the  large  differences  seen  in  overall  solution 
times  clearly  result  from  SV-DW-I  requiring  many  more  iterations  than  SV1-DW-I.  (It  is  interesting 
to see that the number of iterations for SV1-DW-I does not increase commensurately with problem
size,  at  least  for  this  application.  This  bodes  well  for  solving  even  larger  problems.) 
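
The relative contributions of iteration count and per-iteration cost can be read off Tables 2 and 3. The Python sketch below copies the 0%-tolerance times and iteration counts for the five instances that both methods solve and compares the two ratios. The dictionary and function names are illustrative, and the per-iteration time is simply total time divided by iterations, which is a coarse average.

```python
# Times (CPU sec.) and iteration counts at the 0% tolerance, copied from
# Tables 2 and 3. Keys are (stages, scenarios).
SV_TIME = {(2, 2): 55.9, (3, 4): 203.5, (4, 8): 2852.3, (5, 16): 24620.0, (5, 81): 29820.7}
SV_ITER = {(2, 2): 26, (3, 4): 35, (4, 8): 88, (5, 16): 245, (5, 81): 92}
SV1_TIME = {(2, 2): 17.7, (3, 4): 55.8, (4, 8): 284.5, (5, 16): 1212.8, (5, 81): 2812.3}
SV1_ITER = {(2, 2): 13, (3, 4): 15, (4, 8): 13, (5, 16): 16, (5, 81): 18}


def ratios(instance):
    """Return (iteration ratio, per-iteration time ratio) of SV-DW-I to SV1-DW-I."""
    iter_ratio = SV_ITER[instance] / SV1_ITER[instance]
    per_iter_sv = SV_TIME[instance] / SV_ITER[instance]
    per_iter_sv1 = SV1_TIME[instance] / SV1_ITER[instance]
    return iter_ratio, per_iter_sv / per_iter_sv1


for inst in SV_TIME:
    it, per = ratios(inst)
    print(inst, round(it, 1), round(per, 1))
```

In every one of these instances the iteration ratio exceeds the per-iteration time ratio, and the difference is largest for the 5-stage, 16-scenario problem.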

It  is  important  to  note  that  the  subproblems  for  this  particular  application  are  difficult, 
deterministic  network-design  problems  (Johnson,  Lenstra  and  Rinnooy  Kan  1978).  For  this  reason, 
and  because  we  solve  one  subproblem  for  each  scenario-tree  node  in  each  iteration,  the  total  time 
spent  solving  subproblems  is  substantial.  SV-DW-I  spends  93.7%  of  its  time  solving  subproblems 
while  SV1-DW-I  spends  98.2%,  averaged  over  the  problems  both  methods  can  solve.  Clearly,  then, 
any  improvement  in  solution  time  for  subproblems  will  improve  overall  solution  time  almost  as  much. 
All  of  the  technology  that  has  proved  useful  for  solving  deterministic  network-design  problems  is 
worth evaluating for this purpose (e.g., Bienstock and Muratore 2000, Magnanti and Raghavan
2005).

Scenario-tree Statistics          SV-DW-I (iterations)          SV1-DW-I (iterations)
Stages    Scenarios  Tree nodes   5%        1%        0%        5%        1%        0%
(number)  (number)   (number)
2         2          3            17        19        26        11        11        13
3         4          7            27        30        35        10        14        15
4         8          15           67        72        88        10        10        13
5         16         31           183       221       245       12        13        16
6         32         63           -         -         -         11        14        17
5         81         121          63        73        92        10        11        18
6         243        364          -         -         -         11        14        23

Table 3: Number of iterations for SV-DW-I and SV1-DW-I to reach relative optimality gaps of 5%,
1% and 0%.
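
The claim that faster subproblems translate almost directly into faster overall solves can be quantified with an Amdahl-style estimate. The Python sketch below uses the subproblem time shares quoted above (93.7% and 98.2%). The factor-of-two subproblem speedup is a hypothetical value chosen for illustration, and the estimate assumes the iteration count is unchanged.

```python
def overall_speedup(subproblem_share, subproblem_speedup):
    """Amdahl-style estimate: only the subproblem share of the run time
    benefits from the speedup; the remainder (master problems, overhead)
    is assumed to be unaffected."""
    remaining = (1.0 - subproblem_share) + subproblem_share / subproblem_speedup
    return 1.0 / remaining


# Shares from the text; the 2x subproblem speedup is hypothetical.
estimates = {
    name: round(overall_speedup(share, 2.0), 2)
    for name, share in [("SV-DW-I", 0.937), ("SV1-DW-I", 0.982)]
}
print(estimates)
```

Under these assumptions, halving subproblem time would cut the total time of either method by nearly half.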

As  a  final  note,  models  that  fit  the  SV  or  SV1  paradigm,  but  which  have  simpler  subproblems, 
may  solve  very  quickly.  For  instance,  the  multi-stage  stochastic  model  of  Riis  and  Anderson  (2004) 
does  fit  the  paradigm  of  SV,  and  its  subproblems  are  simple  knapsack  problems,  easily  solved  by 
dynamic  programming. 

6  Conclusions 

We  have  described  a  general,  compact  formulation  of  a  multi-stage  stochastic  integer-programming 
model  for  planning  the  capacity  expansion  of  a  production  system  with  one  or  more  production 
facilities.  Capacity-expansions  are  discrete,  and  a  scenario  tree  represents  uncertainty. 

We reformulate the compact formulation using a variable-splitting technique to give a general,
split-variable model (SV) that allows multiple capacity expansions of a facility over the planning
horizon. Based on SV we also devise a split-variable model (SV1) that restricts each facility
to at most one capacity expansion over the planning horizon. A Dantzig-Wolfe reformulation of
either model results in a master problem having a substantially stronger LP relaxation than the
deterministic-equivalent formulation.

For each node n in the scenario tree, we define $\mathcal{P}_n$ to be the set of all predecessors of n,
including n itself. Apart from variables $x_{hn}$, which denote requests for capacity to be installed
in nodes $h \in \mathcal{P}_n$, the variables in a subproblem SP(n) for the Dantzig-Wolfe reformulation of SV
pertain only to node n. Indeed these variables can be viewed in the subproblem simply as alternative
capacity-expansion options at node n of the scenario tree. As a result, the subproblems increase
in difficulty only slightly with an increasing number of stages in a scenario tree. In SV1, the
situation is even better, because the column-generation subproblems involve no variables (such as
$x_{hn}$) from predecessor nodes in the scenario tree. Thus, these subproblems do not become larger
as  the  number  of  stages  increases.  This  situation  contrasts  with  scenario-decomposition  methods  in 
which  the  subproblems  must  cover  an  entire  planning  horizon,  and  so  increase  in  size  as  more  stages 
are  added. 

We  have  applied  our  methods  to  solve  a  capacity-planning  problem  for  an  electricity-distribution 
network,  which  requires  the  use  of  mixed-integer  subproblems.  However,  the  algorithm  described 
is  quite  general.  As  long  as  good  algorithms  exist  to  solve  them,  the  subproblems  can  incorporate 
arbitrary  non-linearities  or  other  complexities,  which  other  applications  may  require. 

The  efficiency  of  column  generation  hinges  on  the  use  of  a  good  duals-stabilization  scheme 
for  the  master  problem.  For  our  application,  the  “interior-point  duals  stabilization”  scheme,  which 
obtains  dual  variables  from  an  interior-point  algorithm,  greatly  outperforms  the  well-known  scheme 
of  du  Merle  et  al.  (1999).  Note  that  in  our  implementation  of  the  interior-point  method,  we  re-solve 
the  master  problems  from  a  cold-start  after  adding  a  new  set  of  columns.  There  is  some  potential 
to  increase  the  speed  of  our  algorithm  by  re-solving  the  master  problems  faster,  using  a  suitable 
hot-start  procedure  for  interior-point  methods  (e.g.,  Gondzio  and  Grothey  2003). 

Our  split-variable  formulation  uses  inequality  non-anticipativity  constraints.  The  validity 
of these constraints relies on the assumption that capacity expansions are non-negative quantities.
If this assumption were to be removed (for example, to admit facility closures) then the
non-anticipativity constraints would have to be replaced by equalities in order to make SV correspond to
CF. Based on this observation, it is tempting to suppose that general multi-stage stochastic
integer-programming problems might be profitably attacked using the approach outlined in this paper. Our
experiments  show  that  a  formulation  with  an  equality-constrained  master  problem  can  be  solved 
using  this  approach,  albeit  with  some  increase  in  computational  effort.  For  small  problems,  this  is  a 
modest  increase,  but  the  larger  problems  take  up  to  ten  times  longer,  so  more  research  is  necessary 
to  make  our  approach  viable  for  the  general  case. 